11:23
2026-05-22
dev.to
artificial-intelligence
Cutting LTX-2 22B Peak VRAM by 40% with fp8_cast, and Why optimum-quanto Was a Trap
The author successfully reduced peak VRAM usage of the LTX-2 22B video generation model from 40 GiB to 24 GiB using the model's native `fp8_cast` quantization method. In contrast, the author found tha…